Many  recent  studies  have  investigated  the  role  of  syntactic  and
semantic  factors  in  listeners'  comprehension  of  continuous  speech. 
The  evidence  is  compelling  and  unambiguous  in  revealing  the  importance 
of  top-down  processes  in  this  situation  (Cole,  1980).  Despite  this, 
only  recently  have  investigators  examined  the  possibility  that  similar 
top-down  processing  occurs  in  the  perception  of  some  complex  nonspeech 
sound  patterns.  Everyday  experience  suggests  that  expectancies  are 
involved  in  our  ability  to  decipher  the  variety  of  nonspeech  stimuli 
we  encounter.  More  specialized  listening  skills  can  also  be  seen  in 
some  individuals.  For  example,  a  sonar  technician  must  identify  the 
source  and  activity  represented  by  the  sounds  recorded  on  passive 
sonar  hydrophones.  Such  sounds  often  occur  in  sequences  or  patterns 
in  which  the  temporal  structure  (sound  order)  can  provide  important 
cues  about  what  the  source  vessel  may  be  doing  (Howard  &  Ballas,
1980).  Similarly,  the  technician's  extensive  knowledge  of  the  sources 
producing  these  sounds  appears  to  influence  his  or  her  perceptual 
capability.  The  present  study  reports  two  experiments  which 
investigate  the  role  of  syntactic  and  semantic  factors  in  the 
classification  of  complex,  nonspeech  patterns. 

Several  investigators  have  presented  evidence  that  top-down 
processing  does  occur  for  nonspeech  stimuli  (Deutsch,  1980;  Howard  & 
Ballas,  1980;  Bregman,  1981).  For  example,  Bregman  (1981)  has
demonstrated  that  listeners  employ  Gestalt  rules  or  principles  to 
segment  acoustic  information  from  different  sources.  He  has  referred 
to  this  phenomenon  as  auditory  streaming. 

Research  in  our  laboratory  has  suggested  that  listeners  are  able 
to  use  both  syntactic  and  semantic  information  in  a  relatively  simple
auditory  pattern  classification  task  (Ballas  &  Howard,  1980;  Howard  &
Ballas,  1980).  Four  groups  of  listeners  were  required  to  classify
patterns  of  brief-duration  real-world  sounds  as  "targets"  or
"nontargets."  For  the  grammatical  group,  target  patterns  were  produced
using  a  simple  finite-state  rule  structure  or  grammar  to  determine  the 
sequential  order  of  the  pattern  components.  In  contrast,  target 
patterns  for  the  nongrammatical  group  matched  the  grammatical  patterns 
in  length,  but  were  randomly  constructed.  As  a  result,  the  target  set 
for  the  latter,  nongrammatical  group  lacked  the  overall  coherence  or 
structure  present  in  the  grammatical  target  set. 

In  addition,  each  of  the  five  pattern  components  was  related  to 
water  or  steam  (e.g.,  the  squeak  of  a  valve  turning,  steam  hiss,  water 
flushing  down  a  drain,  etc.),  and  the  finite-state  grammar  was 
selected  to  produce  only  interpretable  patterns.  In  other  words,  the 
grammar  reflected  the  temporal  structure  of  possible  real  world 
events.  For  example,  one  pattern  might  represent  someone  taking  three 
turns  to  open  a  valve  that  releases  steam,  which,  in  turn,  causes 
pipes  to  clang.  Each  of  the  grammatical  patterns  could  be  described 
by  a  similar  source  scenario  corresponding  to  events  that  might 
actually  occur.  Although  similar  interpretations  obviously  can  be 
applied  to  some  randomly  constructed  patterns,  overall  the 
nongrammatical  target  patterns  were  only  minimally  interpretable.  To 
evaluate  the  role  of  semantic  information  in  nonspeech  pattern 
classification,  one-half  of  the  participants  in  each  of  the  two  groups 
were  read  a  brief  paragraph  which  suggested  a  theme  for  the  patterns 
they  would  hear.  The  paragraph  was  general  and  did  not  identify  any
specific  patterns.  The  remaining  individuals  received  no  explicit 
semantic  information  about  the  patterns. 
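The idea of a finite-state grammar that emits only interpretable sound sequences can be sketched in a few lines of Python. The grammar below is purely hypothetical: the actual grammar appears in the earlier report and is not reproduced here. The state names, the transitions, the short sound labels, and the functions `generate` and `is_grammatical` are all assumptions made for illustration.

```python
import random

# Hypothetical finite-state grammar (NOT the one used in the study).
# Each state maps to a list of (emitted sound, next state) transitions;
# a next state of None means the pattern ends after that sound.
GRAMMAR = {
    "S0": [("valve", "S1")],
    "S1": [("valve", "S1"), ("hiss", "S2")],
    "S2": [("clang", "S2"), ("clang", None), ("drip", "S3")],
    "S3": [("drain", None)],
}


def generate(rng, min_len=4, max_len=6, start="S0"):
    """Walk the grammar from `start`; retry until the length is in range."""
    while True:
        state, pattern = start, []
        while state is not None and len(pattern) <= max_len:
            sound, state = rng.choice(GRAMMAR[state])
            pattern.append(sound)
        if state is None and min_len <= len(pattern) <= max_len:
            return pattern


def is_grammatical(pattern, start="S0"):
    """True if the grammar can emit `pattern` and then stop."""
    states = {start}
    for sound in pattern:
        # Follow every transition that emits this sound from a live state.
        states = {nxt for st in states if st is not None
                  for emitted, nxt in GRAMMAR[st] if emitted == sound}
        if not states:
            return False
    return None in states
```

Under this toy grammar, three valve turns followed by a steam hiss and a pipe clang is a legal pattern, whereas a pattern that begins with a hiss is not. Because the grammar fixes which sounds may follow which, every pattern it produces shares the same sequential structure.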

Three  findings  were  reported  (Howard  &  Ballas,  1980).  First,
listeners  who  classified  the  grammatical  target  patterns  performed 
substantially  better  than  those  who  classified  the  unstructured  or 
random  target  patterns.  Second,  listeners  in  the  grammatical  group 
who  received  semantic  information  performed  at  a  higher  level  than 
those  who  did  not  receive  this  information.  Third,  the  semantic 
information  did  not  enhance  performance  for  listeners  in  the 
nongrammatical  group.  On  the  contrary,  there  was  some  evidence  that 
the  semantic  information  actually  impaired  performance  for  these 
listeners.  Overall,  it  was  concluded  that  syntactic  and  semantic 
factors  interact  to  influence  nonspeech  pattern  classification.  The 
two  experiments  reported  in  the  present  paper  replicate  (Experiment  1) 
and  extend  (Experiment  2)  these  findings. 

Experiment  1

In  this  experiment  participants  were  required  to  classify 
patterns  of  complex,  water-  and  steam-related  sounds.  As  in  our 
previous  research  some  individuals  received  structured  or  grammatical 
target  patterns  whereas  others  received  an  unstructured  or  randomly 
generated  target  set.  One-half  of  the  individuals  in  each  condition 
were  read  a  general  descriptive  passage  which  suggested  a  water/steam 
semantic  context  for  the  patterns  they  were  to  hear.  The  remaining 
participants  were  given  no  explicit  semantic  information.  Overall, 
the  experiment  is  a  replication  of  our  earlier  study;  however,  a
six-point  rating  scale  procedure  was  employed  to  obtain  a  full 
receiver  operating  characteristic  (ROC)  for  each  listener. 

Method 

Participants.  Twenty  student  volunteers,  five  in  each  of  four 
groups,  were  paid  to  participate  in  the  experiment. 

Stimuli.  Five  brief-duration,  "real-world"  sounds  were  recorded
in  our  laboratory  (a  radiator  valve  being  turned,  water  drip,
broadband  steam  hiss,  the  clang  of  a  metal  object  striking  a  radiator
pipe,  and  water  flushing  down  a  drain).  The  sounds  were  digitized
using  standard  signal  processing  techniques  with  a  10-bit
analog-to-digital  converter  at  a  12.5  kHz  sampling  rate.  Each  sound
was  320  ms  in  duration  with  the  exception  of  the  water  drip  which  was 
82  ms  long. 

The  grammatical  target  patterns  were  produced  using  the  simple 
finite-state  grammar  described  in  Howard  &  Ballas  (1980,  p.  432).
Twelve  grammatical  patterns  ranging  in  length  from  four  to  six  events 
(three,  four,  and  five  patterns  of  each  length,  respectively)  were 
selected  to  make  up  the  grammatical  target  category.  A  corresponding 
nongrammatical  target  set  was  produced  by  randomly  permuting  the  order 
of  pattern  components  in  the  grammatical  target  set.  Consequently, 
the  nongrammatical  targets  matched  the  grammatical  targets  in  length 
and  composition.  Similarly,  48  randomly  constructed  nontarget 
patterns  were  selected  to  be  nonoverlapping  with  the  target  sets  but 
to  match  them  in  length.  Each  component  was  presented  at  a 
comfortable  listening  level  and  the  individual  components  were 
separated  by  510  ms  within  the  patterns.
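The construction of the nongrammatical targets and the nontargets described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the short sound labels, the function names `permute_targets` and `make_nontargets`, the way nontarget lengths are supplied by the caller, and the requirement that nontargets be distinct from one another are all hypothetical.

```python
import random

# Short labels for the five recorded sounds (labels are an assumption).
SOUNDS = ["valve", "drip", "hiss", "clang", "drain"]


def permute_targets(targets, rng):
    """Shuffle component order within each pattern.

    Length and composition of every pattern are preserved; only the
    sequential order changes.
    """
    shuffled = []
    for pattern in targets:
        components = list(pattern)
        rng.shuffle(components)
        shuffled.append(tuple(components))
    return shuffled


def make_nontargets(target_sets, lengths, rng):
    """Draw random patterns of the given lengths that appear in no target set."""
    excluded = {tuple(p) for target_set in target_sets for p in target_set}
    nontargets = []
    for n in lengths:
        while True:
            candidate = tuple(rng.choice(SOUNDS) for _ in range(n))
            if candidate not in excluded:
                # Assumption: nontargets are also kept distinct from each other.
                excluded.add(candidate)
                nontargets.append(candidate)
                break
    return nontargets
```

Note that a random shuffle can occasionally reproduce the original order or yield another grammatical sequence; the source does not say how such cases were handled, so the sketch leaves them in.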

Apparatus.  All  experimental  events  were  controlled  by  a
general-purpose  laboratory  computer.  The  acoustic  patterns  were
output  on  a  12-bit  digital-to-analog  converter  at  a  sampling  rate  of
12.5  kHz,  low-pass  filtered  at  5  kHz  (Krohn-Hite  Model  3550),
attenuated,  and  presented  binaurally  over  matched  Telephonics  TDH-49
headphones  with  MX-41/AR  cushions.  Testing  was  done  individually  in  a 
sound-attenuated  booth  and  listeners  indicated  their  responses  by 
pressing  buttons  on  a  solid-state  keyboard.  A  video  display  was  also 
located  in  the  booth. 

Procedure.  Participants  were  told  that  they  would  be  hearing
patterns  of  several  sounds  presented  very  quickly.  They  were  told 
that  some  of  the  patterns  were  designated  as  targets  and  that  their 
task  would  be  to  pick  out  the  targets.  Although  the  participants  were 
told  that  the  targets  and  nontargets  would  occur  equally  often,  no 
information  was  provided  regarding  the  composition  of  the  target  set. 
The  six-point  rating  scale  was  also  explained  (1  =  definitely  a
nontarget,  2  =  probably  a  nontarget,  3  =  possibly  a  nontarget,
4  =  possibly  a  target,  5  =  probably  a  target,  6  =  definitely  a
target).  Participants  in  the  semantic  conditions  were  also  read  the
following  paragraph  before  beginning:  "All  of  the  individual  sounds 
relate  to  water  and  steam.  You  will  hear  such  things  as  drips,  water 
flushing  down  a  drain,  a  valve  being  turned  on,  steam  escaping,  and 
radiator  pipes  clanging." 

Each  trial  began  when  the  word  "LISTEN"  appeared  on  the  video 
screen.  A  response  prompt,  the  six  scale  descriptors,  was  presented 
immediately  after  the  test  pattern.  The  listener  then  responded  by
pressing  a  key  on  the  keypad  (a  digit  between  "1"  and  "6"),  and  verbal
feedback  was  presented  visually  following  the  response.  After  an 
intertrial  interval  of  1.5  s,  the  screen  was  erased  and  the  next  trial 
began.  There  were  96  trials  in  each  test  block,  four  presentations  of 
each  of  the  12  targets  and  48  presentations  of  nontargets.  Listeners 
were  tested  for  12  blocks  over  three  successive  days. 
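The composition of a test block can be checked with a short sketch: four presentations of each of the 12 targets give 48 target trials, which together with 48 nontarget trials make up the 96 trials and the equal target/nontarget rate described to participants. The function name `build_block`, the single presentation of each nontarget, and the random trial order are assumptions; the source does not describe how trials were ordered.

```python
import random


def build_block(targets, nontargets, rng, reps_per_target=4):
    """Return a shuffled list of (pattern, is_target) trials for one block."""
    trials = [(t, True) for t in targets for _ in range(reps_per_target)]
    # Assumption: each nontarget pattern is presented once per block.
    trials += [(n, False) for n in nontargets]
    # Assumption: trial order is randomized within the block.
    rng.shuffle(trials)
    return trials
```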

Immediately  after  the  last  test  block,  listeners  were  told  that 
the  target  patterns  had  been  constructed  using  a  set  of  rules — like 
the  rules  of  language.  It  was  explained  that  they  would  be  hearing  a 
new  set  of  patterns  and  that  their  task  would  be  to  classify  each 
pattern  using  the  six-point  rating  scale.  They  then  completed  an 
additional  block  of  96  trials  as  before,  but  without  feedback.  The 
grammatical  patterns  presented  in  this  test  block  were  produced  by  the 
same  finite-state  grammar  used  for  the  grammatical  target  patterns; 
however,  they  were  not  presented  as  targets  previously.  This  block 
was  included  to  determine  whether  the  participants  could  classify 
novel  grammatical  patterns.  The  participants  were  interviewed  and 
debriefed  before  leaving. 

Results  &  Discussion

A  ROC  function  was  determined  from  the  rating-scale  data  for  each 
participant  on  each  test  block  (Swets,  1979).  A  nonparametric, 
response-bias-free  index  of  performance  was  then  computed  by 
determining  the  area  under  the  ROC  using  a  trapezoidal  algorithm. 
Mean  areas  were  calculated  for  each  condition  by  averaging 
across  individuals.  The  mean  ROC  area  for  each  of  the  four  groups  is 
plotted  by  blocks  in  Figure  1.
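The area computation described above can be illustrated with a minimal sketch. Each rating category is treated as a decision criterion: for criterion k, the hit rate is the proportion of targets rated k or higher and the false-alarm rate is the proportion of nontargets rated k or higher. The resulting points are joined by straight lines and the area is summed with the trapezoidal rule. The function name `roc_area` and its argument layout are assumptions, not the authors' program.

```python
def roc_area(target_ratings, nontarget_ratings, n_categories=6):
    """Area under the rating-scale ROC, computed by the trapezoidal rule.

    Ratings are integers from 1 (definitely a nontarget) to
    `n_categories` (definitely a target).
    """
    n_t, n_n = len(target_ratings), len(nontarget_ratings)
    points = [(0.0, 0.0)]
    # Sweep the criterion from strictest (rating 6 only) to most lenient
    # (rating >= 1), which always ends at the point (1, 1).
    for k in range(n_categories, 0, -1):
        hit_rate = sum(r >= k for r in target_ratings) / n_t
        fa_rate = sum(r >= k for r in nontarget_ratings) / n_n
        points.append((fa_rate, hit_rate))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

An area of .50 corresponds to chance and 1.00 to perfect separation of targets from nontargets. The trapezoidal area equals the probability that a randomly chosen target receives a higher rating than a randomly chosen nontarget, with ties counted as one-half, which is why it does not depend on where an individual listener places the response criteria.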


Insert  Figure  1  here 


Consider  first  the  effect  of  syntactic  pattern  structure  on 
classification  performance.  It  is  evident  from  Figure  1  that 
participants  in  the  two  grammatical  groups  (mean  ROC  area  =  .81) 
performed  at  a  considerably  higher  level  than  did  those  in  the  two 
nongrammatical  groups  (mean  ROC  area  =  .60).  This  finding  is
consistent  with  our  earlier  result  and  indicates  that  individuals  are 
able  to  use  the  underlying  temporal  structure  of  target  patterns  to 
facilitate  classification. 

The  effect  of  semantic  information  is  more  subtle.  In  our 
previous  experiment  semantic  information  led  to  a  slight,  but 
consistent  overall  improvement  in  classification  performance  for 
individuals  receiving  structured  target  patterns.  It  is  apparent  from 
Figure  1  that  in  the  present  experiment,  overall  performance  is  very 
similar  for  the  two  grammatical  groups  (mean  ROC  areas  of  .82  and  .80 
for  the  semantic  and  no-semantic  conditions,  respectively).
Nevertheless,  it  is  interesting  to  note  that  on  each  of  the  first  six 
test  blocks,  individuals  who  were  provided  with  explicit  semantic 
information  outperformed  those  who  received  no  explicit  information. 
This  trend  is  consistent  with  our  previous  finding.  It  suggests  that 
in  the  present  experiment  any  effect  of  explicit  semantic  information 
had  disappeared  by  the  seventh  test  block. 

Figure  1.  Mean  ROC  area  by  block  for  the  grammatical  (G)  and
nongrammatical  (NG)  groups,  with  (S)  and  without  (NS)  explicit
semantic  information.

Two  explanations  exist  for  this  result.  First,  it  is  possible
that  with  experience  individuals  in  the  no-semantic  condition  were
able  to  develop  their  own  labels  and  descriptive  scenarios  for  the 
target  patterns.  This  hypothesis  seems  likely  given  the  small  number 
of  different  sounds  employed  here  (five)  and  the  relative  familiarity 
of  these  sounds.  Individuals  in  the  semantic  condition  were  provided 
with  an  appropriate  semantic  framework  for  the  patterns  initially  and 
consequently  began  performance  at  a  higher  level.  Second,  it  is  also 
possible  that  the  convergence  of  the  two  groups  reflects  simple 
artifact — a  ceiling  effect.  Although  theoretically,  the  ROC  area  can 
reach  a  value  of  1.00  with  perfect  performance,  most  individuals 
showed  surprisingly  stable  asymptotic  performance  across  the  final  six 
test  blocks. 

Another  outcome  of  our  previous  experiment  was  that  individuals 
who  classified  unstructured  target  patterns  did  not  benefit  from  the 
explicit  semantic  information  we  provided.  A  similar  finding  is 
evident  for  the  nongrammatical  groups  in  the  present  study.  Mean 
overall  performance  was  very  similar  for  the  two  groups  (mean  areas  of 
.59  and  .61  for  the  semantic  and  no-semantic  conditions, 
respectively).  Furthermore,  inspection  of  Figure  1  reveals  no 
systematic  differences  between  the  two  nongrammatical  groups. 

Mean  ROC  areas  were  also  computed  for  the  final  test  block  on 
which  participants  classified  novel  grammatical  patterns  without 
feedback.  The  purpose  of  these  trials  was  to  determine  if  individuals 
could  generalize  their  knowledge  of  the  target-pattern  structure  to 
previously  untested  grammatical  patterns.  Both  grammatical  groups 
performed  substantially  above  the  .50  chance  level  (mean  ROC  area  of 
.73  and  .87  for  the  semantic  and  no-semantic  conditions,
respectively),  whereas  the  two  nongrammatical  groups  responded  at
chance  level  (mean  ROC  areas  of  .51  and  .53  for  the  semantic  and
no-semantic  conditions,  respectively).  These  findings  are  consistent
with  our  previous  work  and  support  the  argument  that  participants  in 
the  grammatical  groups  learn  something  about  the  underlying  structure 
of  the  target  set  during  classification.  That  is,  they  learn 
something  more  general  than  simple  paired-associate  responses  to 
individual  patterns.  Since  individuals  in  the  nongrammatical  groups 
were  not  exposed  to  structured  target  stimuli,  it  is  obvious  that  they 
would  not  be  expected  to  perform  any  better  than  chance  on  this  block. 

Experiment  2 

In  general,  the  findings  of  Experiment  1  are  consistent  with  our 
earlier  results  (Howard  &  Ballas,  1980)  in  revealing  that  syntactic
and  semantic  factors  interact  in  an  important  way  to  influence 
classification  performance  for  complex  nonspeech  patterns.  The 
grammatical  and  nongrammatical  conditions  investigated  in  Experiment 
1,  however,  represent  two  extreme  conditions.  On  the  one  hand,  the
target  patterns  were  both  interpretable  and  had  an  underlying  temporal
structure  (grammatical  conditions),  whereas  on  the  other  hand  the
target  patterns  were  not  interpretable,  nor  did  they  have  any
underlying  syntactic  or  temporal  structure  (nongrammatical  conditions).  Consequently,  the
syntactic  effects  demonstrated  in  the  previous  experiment  can  be 
attributed  either  to  pattern  interpretability  or  to  the  presence  of  an 
underlying  temporal  structure.  In  the  present  experiment,  two
additional  conditions  were  tested  to  address  this  issue.
Specifically,  grammatical  target  patterns  were  employed  in  a  target 
classification  task;  however,  the  patterns  were  constructed  so  as  to
be  generally  uninterpretable.  This  was  accomplished  by  permuting  the 
assignment  of  pattern  component  sounds  to  the  output  of  the 
finite-state  grammar  used  in  Experiment  1.  In  other  words,  the  target 
patterns  were  grammatically  structured,  but  unlike  those  used  in  the 
first  experiment,  they  were  not  consistently  interpretable.  As 
before,  one-half  the  participants  received  general  semantic 
information  regarding  the  sounds. 

Method 

Participants.  Ten  student  volunteers,  five  in  each  of  two
groups,  were  paid  to  participate  in  the  experiment.  No  individual  had 
participated  in  the  previous  experiment. 

Stimuli.  The  same  five  "real-world"  sounds  and  finite-state
grammar  used  in  Experiment  1  were  used  to  produce  the  stimulus 
patterns.  The  assignment  of  component  sounds  to  the  output  codes 
produced  by  the  grammar  was  determined  randomly  to  minimize 
interpretability  of  the  target  patterns.  The  nontarget  patterns  were 
generated  as  in  Experiment  1. 
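The manipulation described above, reassigning component sounds to the grammar's output codes, can be sketched as a one-to-one relabeling. The sketch assumes the reassignment is a permutation of the five sounds, as the phrase "permuting the assignment" suggests. The short sound labels and the function names `random_assignment` and `remap` are hypothetical.

```python
import random

# Short labels for the five recorded sounds (labels are an assumption).
SOUNDS = ["valve", "drip", "hiss", "clang", "drain"]


def random_assignment(rng):
    """Return a random one-to-one mapping from original to reassigned sounds."""
    shuffled = SOUNDS[:]
    rng.shuffle(shuffled)
    return dict(zip(SOUNDS, shuffled))


def remap(pattern, mapping):
    """Apply the reassignment to every component of a pattern."""
    return [mapping[sound] for sound in pattern]
```

Because the mapping is one-to-one, a remapped pattern keeps the sequential structure of the original: positions that held the same sound still hold the same sound, and the transitions allowed by the grammar are unchanged up to relabeling. Only the real-world scenario suggested by the sequence is disrupted. A random permutation can leave some sounds in place; the source does not say whether such assignments were rejected.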

Apparatus.  Same  as  Experiment  1. 

Procedure.  Same  as  Experiment  1. 

Results  &  Discussion 

A  ROC  area  was  determined  for  each  individual  on  each  block  as  in 
Experiment  1.  The  mean  ROC  area  is  plotted  as  a  function  of  block  for 
the  two  semantic  conditions  in  Figure  2. 

Insert  Figure  2  here


Two  results  are  evident  in  Figure  2.  First,  individuals  in  both 
semantic  information  conditions  showed  a  consistent  improvement  in 
performance  with  practice.  Indeed,  their  performance  more  closely 
approximates  that  of  the  grammatical  group  in  Figure  1  than  that  of 
the  nongrammatical  group.  The  mean  overall  performance  level  observed 
in  this  experiment  (mean  ROC  area  of  .80)  is  virtually  identical  to 
the  mean  performance  level  obtained  for  the  grammatical  group  in 
Experiment  1  (mean  ROC  area  of  .81).  Since  the  target  patterns 
employed  in  the  present  experiment  were  structured,  but  uninterpretable,
this  suggests  that  performance  depends  more  on  syntactic  pattern 
structure  than  on  pattern  interpretability  per  se. 

Second,  despite  the  similarities  already  noted,  important
differences  also  exist  between  the  present  results  and  those  of
Experiment  1.  In  particular,  individuals  in  the  no-semantic  condition
outperformed  those  in  the  semantic  condition  on  all  but  a  single  block
in  the  present  experiment  (block  5).  This  pattern  differs  from  that
observed  in  our  previous  research  (Howard  &  Ballas,  1980)  and  in
Experiment  1  for  individuals  classifying  grammatically  structured
targets.  It  appears  that  the  general  semantic  information  we  provided
actually  interfered  with  classification  performance  for  structured,
but  uninterpretable  patterns.  This  result  suggests  that  when  working
with  target  patterns  that  are  generally  uninterpretable,  specific
thematic  information  may  inappropriately  lead  individuals  to  search
for  sensible  interpretations  of  the  patterns  where  none  exist.  This
misguided  search  may  actually  make  the  task  more  difficult.  A  similar
tendency  for  semantic  information  to  impair  performance  with
uninterpretable,  nongrammatical  patterns  was  noted  in  our  previous
research  (Howard  &  Ballas,  1980).

Figure  2.  Mean  ROC  area  by  block  for  the  uninterpretable  groups  of
Experiment  2,  with  (S)  and  without  (NS)  explicit  semantic  information.

Mean  ROC  areas  were  also  computed  for  the  last,  no-feedback  test 
block  with  novel  grammatical  patterns.  Individuals  in  both  semantic 
conditions  performed  considerably  above  chance  as  expected  (mean  ROC 
areas  of  .77  and  .81  for  the  semantic  and  no-semantic  conditions, 
respectively).


General  Discussion 

The  results  of  the  present  experiments  are  consistent  with  our 
earlier  findings  in  revealing  that  both  syntactic  (sequential 
structure)  and  semantic  factors  can  play  a  role  in  nonspeech  pattern 
classification.  In  particular,  Experiment  1  demonstrated  that  when 
listeners  are  required  to  classify  sequentially  structured, 
interpretable  target  patterns  they  are  able  to  use  this  information  to 
facilitate  the  task.  Furthermore,  the  results  of  Experiment  2  showed 
similarly  high  performance  for  listeners  who  classified  structured, 
but  minimally  interpretable  patterns.  This  suggests  that  pattern 
structure  rather  than  interpretability  per  se  is  largely  responsible 
for  the  enhanced  performance  observed  in  Experiment  1.  The  finding 
that  listeners  in  the  structured  or  grammatical  groups  successfully 
generalized  their  knowledge  to  classify  novel  grammatical  patterns  on 
a  post-experimental  test  block  is  consistent  with  the  argument  that
listeners  are  able  to  learn  general  characteristics  of  the  pattern
structure.

The  present  experiments  also  showed  that  explicit  semantic 
information  can  lead  to  enhanced  classification  performance,  at  least
on  initial  trials,  when  the  target  patterns  are  consistently
interpretable.  On  the  other  hand,  when  the  target  set  is  not 
interpretable,  explicit  semantic  information  leads  to  no  improvement 
(Experiment  1)  or  to  a  performance  decrement  (Experiment  2).  As
suggested  in  our  earlier  paper  (Howard  &  Ballas,  1980),  it  is  likely
that  providing  explicit  semantic  information  leads  listeners  to  search 
for  a  consistent  semantic  interpretation  for  the  target  pattern  set. 
When  no  consistent  interpretation  exists,  the  classification  task  can 
actually  become  more  difficult. 

In  general,  it  is  clear  that  further  work  is  needed  to  clarify 
the  role  of  both  syntactic  and  semantic  factors  in  the  perception  of 
complex  nonspeech  patterns.  Such  factors  are  likely  to  play  an
important  role  in  understanding  the  comprehension  of  everyday  sounds
as  well  as  specialized  tasks  such  as  that  of  the  sonar  technician. 